Overview

Dataset statistics

Number of variables: 14
Number of observations: 88258
Missing cells: 361726
Missing cells (%): 29.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.5 MiB
Average record size in memory: 89.0 B
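
The missing-cell figures above can be cross-checked against the per-column counts listed under Alerts. The following is a minimal sketch, assuming the percentage is defined as missing cells divided by total cells (variables × observations); the variable names are illustrative and the numbers are copied from this report.

```python
# Cross-check of the overview figures using numbers copied from this report.
n_variables = 14
n_observations = 88258

# Per-column missing counts as listed in the Alerts section.
missing_by_column = {
    "Seleccionado_por_ID_de_persona": 13570,
    "ID_de_pedido_pendiente": 79035,
    "Comentarios": 88258,
    "Instrucciones_de_entrega": 88258,
    "Comentarios_internos": 88258,
    "Seleccion_completada_cuando": 4347,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 361726
print(round(missing_pct, 1))  # 29.3
```

Three fully empty columns account for 264774 of the 361726 missing cells, so most of the 29.3% comes from columns that carry no data at all.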

Variable types

Numeric: 7
Categorical: 2
Boolean: 1
Unsupported: 3
DateTime: 1

Alerts

Pedido_pendiente_de_suministro_insuficiente has constant value "True" Constant
Fecha_de_pedido has a high cardinality: 1512 distinct values High cardinality
Fecha_de_entrega_esperada has a high cardinality: 630 distinct values High cardinality
ID_de_pedido is highly correlated with ID_de_pedido_pendiente High correlation
ID_de_cliente is highly correlated with ID_de_persona_de_contacto High correlation
ID_de_persona_de_contacto is highly correlated with ID_de_cliente High correlation
ID_de_pedido_pendiente is highly correlated with ID_de_pedido High correlation
Seleccionado_por_ID_de_persona has 13570 (15.4%) missing values Missing
ID_de_pedido_pendiente has 79035 (89.5%) missing values Missing
Comentarios has 88258 (100.0%) missing values Missing
Instrucciones_de_entrega has 88258 (100.0%) missing values Missing
Comentarios_internos has 88258 (100.0%) missing values Missing
Seleccion_completada_cuando has 4347 (4.9%) missing values Missing
ID_de_pedido is uniformly distributed Uniform
Comentarios is an unsupported type, check if it needs cleaning or further analysis Unsupported
Instrucciones_de_entrega is an unsupported type, check if it needs cleaning or further analysis Unsupported
Comentarios_internos is an unsupported type, check if it needs cleaning or further analysis Unsupported
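
The Missing and Constant alerts above follow simple rules that are easy to reproduce. The sketch below is a hypothetical helper written for illustration (`column_alerts` is not part of pandas-profiling, and this is not how the library implements its alerts); it assumes rows are plain dictionaries and that `None` marks a missing value.

```python
# Hypothetical helper: illustrates the idea behind two alert types only.
def column_alerts(records, columns):
    """Return {column: [alert tags]} for columns with missing or constant values."""
    n_rows = len(records)
    alerts = {}
    for col in columns:
        present = [r[col] for r in records if r.get(col) is not None]
        tags = []
        n_missing = n_rows - len(present)
        if n_missing:
            tags.append(f"Missing {n_missing} ({100 * n_missing / n_rows:.1f}%)")
        if present and len(set(present)) == 1:
            tags.append("Constant")
        if tags:
            alerts[col] = tags
    return alerts


# Toy rows shaped like this dataset; the values are illustrative only.
rows = [
    {"ID_de_pedido": 1, "Pedido_pendiente_de_suministro_insuficiente": True, "Comentarios": None},
    {"ID_de_pedido": 2, "Pedido_pendiente_de_suministro_insuficiente": True, "Comentarios": None},
]
alerts = column_alerts(rows, ["ID_de_pedido", "Pedido_pendiente_de_suministro_insuficiente", "Comentarios"])
print(alerts)
```

Columns flagged as constant or 100% missing carry no information and are the usual candidates for removal before modelling.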

Reproduction

Analysis started: 2024-06-19 02:23:03.840922
Analysis finished: 2024-06-19 02:26:39.739229
Duration: 3 minutes and 35.9 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

ID_de_pedido
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM

Distinct: 54145
Distinct (%): 61.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46547.75124
Minimum: 19443
Maximum: 73595
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 344.9 KiB

Quantile statistics

Minimum: 19443
5-th percentile: 22149
Q1: 33072
median: 46546
Q3: 60071
95-th percentile: 70898.15
Maximum: 73595
Range: 54152
Interquartile range (IQR): 26999

Descriptive statistics

Standard deviation: 15608.64492
Coefficient of variation (CV): 0.3353254348
Kurtosis: -1.196202414
Mean: 46547.75124
Median Absolute Deviation (MAD): 13502
Skewness: -0.0007785763955
Sum: 4108211429
Variance: 243629796.3
Monotonicity: Not monotonic
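
Several of the statistics above are derived from others, which gives a quick consistency check. The sketch below recomputes them from the reported values for ID_de_pedido; the comparison with a continuous uniform distribution (standard deviation = range / √12) is an added illustration of why the Uniform alert is plausible, not something the report itself computes.

```python
import math

# Values copied from the ID_de_pedido statistics in this report.
minimum, maximum = 19443, 73595
q1, q3 = 33072, 60071
mean, std = 46547.75124, 15608.64492

value_range = maximum - minimum  # reported Range: 54152
iqr = q3 - q1                    # reported IQR: 26999
cv = std / mean                  # reported CV: 0.3353254348
variance = std ** 2              # reported Variance: 243629796.3

# For a continuous uniform distribution over the same range,
# the standard deviation would be range / sqrt(12).
uniform_std = value_range / math.sqrt(12)

print(value_range, iqr)
print(round(cv, 6))
print(round(uniform_std, 1), "vs reported", std)
```

The uniform-model standard deviation comes out within about 0.2% of the reported one, and the reported kurtosis of about -1.196 is close to the -1.2 of a uniform distribution.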
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
42245 | 3 | < 0.1%
50881 | 3 | < 0.1%
63970 | 3 | < 0.1%
53676 | 3 | < 0.1%
25850 | 3 | < 0.1%
72992 | 3 | < 0.1%
53083 | 3 | < 0.1%
47875 | 3 | < 0.1%
34251 | 3 | < 0.1%
62677 | 3 | < 0.1%
Other values (54135) | 88228 | > 99.9%

Minimum 10 values
Value | Count | Frequency (%)
19443 | 1 | < 0.1%
19444 | 1 | < 0.1%
19445 | 1 | < 0.1%
19446 | 1 | < 0.1%
19447 | 1 | < 0.1%
19448 | 1 | < 0.1%
19449 | 2 | < 0.1%
19450 | 1 | < 0.1%
19451 | 2 | < 0.1%
19452 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
73595 | 2 | < 0.1%
73594 | 2 | < 0.1%
73593 | 2 | < 0.1%
73592 | 2 | < 0.1%
73591 | 3 | < 0.1%
73590 | 1 | < 0.1%
73589 | 1 | < 0.1%
73588 | 2 | < 0.1%
73587 | 1 | < 0.1%
73586 | 2 | < 0.1%

ID_de_cliente
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 663
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 534.5820549
Minimum: 1
Maximum: 1061
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 344.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 33
Q1: 162
median: 521
Q3: 883
95-th percentile: 1012
Maximum: 1061
Range: 1060
Interquartile range (IQR): 721

Descriptive statistics

Standard deviation: 345.8665216
Coefficient of variation (CV): 0.646984908
Kurtosis: -1.42979991
Mean: 534.5820549
Median Absolute Deviation (MAD): 360
Skewness: -0.06455180167
Sum: 47181143
Variance: 119623.6508
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
472 | 182 | 0.2%
548 | 175 | 0.2%
183 | 174 | 0.2%
70 | 173 | 0.2%
964 | 173 | 0.2%
405 | 173 | 0.2%
804 | 172 | 0.2%
67 | 172 | 0.2%
169 | 171 | 0.2%
118 | 170 | 0.2%
Other values (653) | 86523 | 98.0%

Minimum 10 values
Value | Count | Frequency (%)
1 | 159 | 0.2%
2 | 152 | 0.2%
3 | 151 | 0.2%
4 | 117 | 0.1%
5 | 149 | 0.2%
6 | 129 | 0.1%
7 | 134 | 0.2%
8 | 120 | 0.1%
9 | 163 | 0.2%
10 | 157 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
1061 | 16 | < 0.1%
1060 | 6 | < 0.1%
1059 | 11 | < 0.1%
1058 | 20 | < 0.1%
1057 | 26 | < 0.1%
1056 | 21 | < 0.1%
1055 | 45 | 0.1%
1054 | 47 | 0.1%
1053 | 40 | < 0.1%
1052 | 49 | 0.1%

ID_de_vendedor
Real number (ℝ≥0)

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.42955879
Minimum: 2
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 344.9 KiB

Quantile statistics

Minimum: 2
5-th percentile: 2
Q1: 6
median: 13
Q3: 15
95-th percentile: 20
Maximum: 20
Range: 18
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 5.712593445
Coefficient of variation (CV): 0.5477310745
Kurtosis: -1.251630752
Mean: 10.42955879
Median Absolute Deviation (MAD): 5
Skewness: 0.03193206922
Sum: 920492
Variance: 32.63372386
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
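
Common values
Value | Count | Frequency (%)
16 | 9008 | 10.2%
15 | 8924 | 10.1%
13 | 8921 | 10.1%
2 | 8872 | 10.1%
14 | 8851 | 10.0%
6 | 8806 | 10.0%
20 | 8779 | 9.9%
3 | 8728 | 9.9%
8 | 8690 | 9.8%
7 | 8679 | 9.8%

Minimum 10 values
Value | Count | Frequency (%)
2 | 8872 | 10.1%
3 | 8728 | 9.9%
6 | 8806 | 10.0%
7 | 8679 | 9.8%
8 | 8690 | 9.8%
13 | 8921 | 10.1%
14 | 8851 | 10.0%
15 | 8924 | 10.1%
16 | 9008 | 10.2%
20 | 8779 | 9.9%

Maximum 10 values
Value | Count | Frequency (%)
20 | 8779 | 9.9%
16 | 9008 | 10.2%
15 | 8924 | 10.1%
14 | 8851 | 10.0%
13 | 8921 | 10.1%
8 | 8690 | 9.8%
7 | 8679 | 9.8%
6 | 8806 | 10.0%
3 | 8728 | 9.9%
2 | 8872 | 10.1%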

Seleccionado_por_ID_de_persona
Real number (ℝ≥0)

MISSING

Distinct: 19
Distinct (%): < 0.1%
Missing: 13570
Missing (%): 15.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.95699443
Minimum: 2
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 689.6 KiB

Quantile statistics

Minimum: 2
5-th percentile: 3
Q1: 6
median: 11
Q3: 16
95-th percentile: 19
Maximum: 20
Range: 18
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 5.460278195
Coefficient of variation (CV): 0.4983372247
Kurtosis: -1.233665779
Mean: 10.95699443
Median Absolute Deviation (MAD): 5
Skewness: 0.01761212988
Sum: 818356
Variance: 29.81463797
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)

Common values
Value | Count | Frequency (%)
18 | 5022 | 5.7%
8 | 4851 | 5.5%
7 | 4575 | 5.2%
3 | 4469 | 5.1%
17 | 4375 | 5.0%
4 | 4276 | 4.8%
10 | 4072 | 4.6%
9 | 4000 | 4.5%
6 | 3862 | 4.4%
14 | 3786 | 4.3%
Other values (9) | 31400 | 35.6%
(Missing) | 13570 | 15.4%

Minimum 10 values
Value | Count | Frequency (%)
2 | 3448 | 3.9%
3 | 4469 | 5.1%
4 | 4276 | 4.8%
5 | 3022 | 3.4%
6 | 3862 | 4.4%
7 | 4575 | 5.2%
8 | 4851 | 5.5%
9 | 4000 | 4.5%
10 | 4072 | 4.6%
11 | 3437 | 3.9%

Maximum 10 values
Value | Count | Frequency (%)
20 | 3348 | 3.8%
19 | 3603 | 4.1%
18 | 5022 | 5.7%
17 | 4375 | 5.0%
16 | 3716 | 4.2%
15 | 3734 | 4.2%
14 | 3786 | 4.3%
13 | 3545 | 4.0%
12 | 3547 | 4.0%
11 | 3437 | 3.9%

ID_de_persona_de_contacto
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 663
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2234.913606
Minimum: 1001
Maximum: 3261
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 344.9 KiB

Quantile statistics

Minimum: 1001
5-th percentile: 1065
Q1: 1323
median: 2241
Q3: 3083
95-th percentile: 3212
Maximum: 3261
Range: 2260
Interquartile range (IQR): 1760

Descriptive statistics

Standard deviation: 800.280987
Coefficient of variation (CV): 0.3580813974
Kurtosis: -1.475072091
Mean: 2234.913606
Median Absolute Deviation (MAD): 868
Skewness: -0.1753608184
Sum: 197249005
Variance: 640449.6582
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
2143 | 182 | 0.2%
2295 | 175 | 0.2%
1365 | 174 | 0.2%
1139 | 173 | 0.2%
3164 | 173 | 0.2%
2009 | 173 | 0.2%
3004 | 172 | 0.2%
1133 | 172 | 0.2%
1337 | 171 | 0.2%
1235 | 170 | 0.2%
Other values (653) | 86523 | 98.0%

Minimum 10 values
Value | Count | Frequency (%)
1001 | 159 | 0.2%
1003 | 152 | 0.2%
1005 | 151 | 0.2%
1007 | 117 | 0.1%
1009 | 149 | 0.2%
1011 | 129 | 0.1%
1013 | 134 | 0.2%
1015 | 120 | 0.1%
1017 | 163 | 0.2%
1019 | 157 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
3261 | 16 | < 0.1%
3260 | 6 | < 0.1%
3259 | 11 | < 0.1%
3258 | 20 | < 0.1%
3257 | 26 | < 0.1%
3256 | 21 | < 0.1%
3255 | 45 | 0.1%
3254 | 47 | 0.1%
3253 | 40 | < 0.1%
3252 | 49 | 0.1%

ID_de_pedido_pendiente
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 5666
Distinct (%): 61.4%
Missing: 79035
Missing (%): 89.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 46971.11081
Minimum: 19532
Maximum: 73595
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 689.6 KiB

Quantile statistics

Minimum: 19532
5-th percentile: 22430.1
Q1: 33756.5
median: 47272
Q3: 60249.5
95-th percentile: 71019.9
Maximum: 73595
Range: 54063
Interquartile range (IQR): 26493

Descriptive statistics

Standard deviation: 15466.66876
Coefficient of variation (CV): 0.3292804554
Kurtosis: -1.180305062
Mean: 46971.11081
Median Absolute Deviation (MAD): 13273
Skewness: -0.02223146013
Sum: 433214555
Variance: 239217842.5
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
27630 | 3 | < 0.1%
71586 | 3 | < 0.1%
21967 | 3 | < 0.1%
26320 | 3 | < 0.1%
71212 | 3 | < 0.1%
43244 | 3 | < 0.1%
39370 | 3 | < 0.1%
52209 | 3 | < 0.1%
72277 | 3 | < 0.1%
64593 | 3 | < 0.1%
Other values (5656) | 9193 | 10.4%
(Missing) | 79035 | 89.5%

Minimum 10 values
Value | Count | Frequency (%)
19532 | 3 | < 0.1%
19533 | 1 | < 0.1%
19534 | 1 | < 0.1%
19535 | 2 | < 0.1%
19536 | 1 | < 0.1%
19537 | 2 | < 0.1%
19538 | 2 | < 0.1%
19539 | 2 | < 0.1%
19590 | 2 | < 0.1%
19591 | 2 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
73595 | 3 | < 0.1%
73594 | 2 | < 0.1%
73593 | 2 | < 0.1%
73592 | 2 | < 0.1%
73591 | 2 | < 0.1%
73590 | 1 | < 0.1%
73589 | 1 | < 0.1%
73588 | 1 | < 0.1%
73587 | 1 | < 0.1%
73505 | 2 | < 0.1%

Fecha_de_pedido
Categorical

HIGH CARDINALITY

Distinct: 1512
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 689.6 KiB

Value | Count
2015-02-03 | 163
2016-01-06 | 163
2015-10-19 | 162
2016-03-23 | 159
2015-07-06 | 158
Other values (1507) | 87453

Length

Max length: 11
Median length: 10
Mean length: 10.22700492
Min length: 10

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Mar 05,2015
2nd row: Mar 24,2015
3rd row: Mar 18,2014
4th row: Apr 21,2015
5th row: Nov 11,2014
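
The sample rows use a `Mar 05,2015` style while the most frequent values are ISO dates such as `2015-02-03`, so the column appears to mix two string formats (consistent with the maximum length of 11 against a minimum of 10). That is probably why it was profiled as Categorical rather than DateTime. The sketch below shows one way to normalise both formats before re-profiling; `parse_order_date` is a hypothetical helper, the two format strings are inferred from the values visible in this report, and `%b` assumes English month abbreviations in the active locale.

```python
from datetime import date, datetime

# Formats inferred from the values shown in this report (an assumption).
KNOWN_FORMATS = ("%Y-%m-%d", "%b %d,%Y")


def parse_order_date(text: str) -> date:
    """Parse a Fecha_de_pedido string in either observed format."""
    cleaned = text.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue  # try the next candidate format
    raise ValueError(f"unrecognised date format: {text!r}")


print(parse_order_date("Mar 05,2015"))  # 2015-03-05
print(parse_order_date("2015-02-03"))   # 2015-02-03
```

Raising on unknown input, rather than returning a null, makes any third format surface immediately instead of silently adding missing values.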

Common Values

Value | Count | Frequency (%)
2015-02-03 | 163 | 0.2%
2016-01-06 | 163 | 0.2%
2015-10-19 | 162 | 0.2%
2016-03-23 | 159 | 0.2%
2015-07-06 | 158 | 0.2%
2015-11-24 | 158 | 0.2%
2016-04-28 | 155 | 0.2%
2016-05-04 | 155 | 0.2%
2015-02-27 | 155 | 0.2%
2015-07-30 | 154 | 0.2%
Other values (1502) | 86676 | 98.2%

Length

Histogram of lengths of the category

Words
Value | Count | Frequency (%)
may | 2187 | 2.0%
apr | 2148 | 2.0%
jan | 2108 | 1.9%
mar | 1951 | 1.8%
feb | 1858 | 1.7%
jul | 1576 | 1.5%
oct | 1467 | 1.4%
jun | 1420 | 1.3%
dec | 1412 | 1.3%
sep | 1357 | 1.3%
Other values (851) | 90809 | 83.9%

Fecha_de_entrega_esperada
Categorical

HIGH CARDINALITY

Distinct: 630
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 689.6 KiB

Value | Count
2015-03-02 | 287
2016-05-16 | 283
2016-01-11 | 266
2014-04-28 | 265
2015-06-08 | 264
Other values (625) | 86893

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2015-03-06
2nd row: 2015-03-25
3rd row: 2014-03-19
4th row: 2015-04-22
5th row: 2014-11-12

Common Values

Value | Count | Frequency (%)
2015-03-02 | 287 | 0.3%
2016-05-16 | 283 | 0.3%
2016-01-11 | 266 | 0.3%
2014-04-28 | 265 | 0.3%
2015-06-08 | 264 | 0.3%
2015-06-01 | 263 | 0.3%
2015-09-14 | 263 | 0.3%
2015-04-06 | 262 | 0.3%
2016-04-11 | 259 | 0.3%
2015-03-30 | 258 | 0.3%
Other values (620) | 85588 | 97.0%

Length

Histogram of lengths of the category

Numero_de_pedido_de_compra_del_cliente
Real number (ℝ≥0)

Distinct: 9926
Distinct (%): 11.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14993.60438
Minimum: 10001
Maximum: 20000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 689.6 KiB

Quantile statistics

Minimum: 10001
5-th percentile: 10505
Q1: 12493
median: 14995
Q3: 17496
95-th percentile: 19492
Maximum: 20000
Range: 9999
Interquartile range (IQR): 5003

Descriptive statistics

Standard deviation: 2882.03311
Coefficient of variation (CV): 0.1922174974
Kurtosis: -1.198478346
Mean: 14993.60438
Median Absolute Deviation (MAD): 2501.5
Skewness: 0.001300334427
Sum: 1323305535
Variance: 8306114.85
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
15358 | 32 | < 0.1%
19786 | 30 | < 0.1%
10776 | 29 | < 0.1%
15758 | 29 | < 0.1%
16651 | 28 | < 0.1%
14856 | 28 | < 0.1%
12650 | 27 | < 0.1%
19885 | 27 | < 0.1%
18144 | 26 | < 0.1%
11658 | 26 | < 0.1%
Other values (9916) | 87976 | 99.7%

Minimum 10 values
Value | Count | Frequency (%)
10001 | 7 | < 0.1%
10002 | 20 | < 0.1%
10003 | 20 | < 0.1%
10004 | 15 | < 0.1%
10005 | 15 | < 0.1%
10006 | 6 | < 0.1%
10007 | 22 | < 0.1%
10008 | 4 | < 0.1%
10009 | 21 | < 0.1%
10010 | 4 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
20000 | 9 | < 0.1%
19999 | 1 | < 0.1%
19998 | 10 | < 0.1%
19997 | 4 | < 0.1%
19996 | 16 | < 0.1%
19995 | 10 | < 0.1%
19994 | 9 | < 0.1%
19993 | 12 | < 0.1%
19992 | 7 | < 0.1%
19991 | 8 | < 0.1%

Pedido_pendiente_de_suministro_insuficiente
Boolean

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 86.3 KiB

Value | Count | Frequency (%)
True | 88258 | 100.0%

Comentarios
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 88258
Missing (%): 100.0%
Memory size: 689.6 KiB

Instrucciones_de_entrega
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 88258
Missing (%): 100.0%
Memory size: 689.6 KiB

Comentarios_internos
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 88258
Missing (%): 100.0%
Memory size: 689.6 KiB

Seleccion_completada_cuando
DateTime

Distinct: 1506
Distinct (%): 1.8%
Missing: 4347
Missing (%): 4.9%
Memory size: 689.6 KiB
Minimum: 2014-01-01 11:00:00
Maximum: 2016-05-31 12:00:00
Histogram with fixed size bins (bins=50)

Interactions

[49 interaction plots, one per pair of the 7 numeric variables; images not preserved.]

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
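
The calculation described above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. This is an illustrative implementation, not the one pandas-profiling uses; the helper names are hypothetical, ties receive the average rank, and the input is the first five ID_de_cliente and ID_de_persona_de_contacto values from the First rows sample.

```python
import math
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables over the product of their std devs."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


clientes = [139, 892, 542, 924, 43]
contactos = [1277, 3092, 2283, 3124, 1085]
print(round(spearman_rho(clientes, contactos), 6))  # 1.0
```

Both lists sort into the same order, so ρ is 1 even though the relationship between the two IDs need not be a straight line.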

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
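
A minimal sketch of that formula, together with a check of the location and scale invariance mentioned above. The function name is hypothetical and the input is the first five ID_de_cliente and ID_de_persona_de_contacto values from the First rows sample; this is not the code path pandas-profiling itself uses.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their std devs."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


clientes = [139, 892, 542, 924, 43]
contactos = [1277, 3092, 2283, 3124, 1085]

r = pearson_r(clientes, contactos)
# Shifting and rescaling one variable leaves r unchanged.
r_shifted = pearson_r([2 * c + 3 for c in clientes], contactos)

print(round(r, 4))
print(math.isclose(r, r_shifted))
```

The common 1/n factors in the covariance and the standard deviations cancel, which is why the sketch works with plain sums.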

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
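
The pair-counting definition above translates directly into code. The sketch below implements that simple form (often called tau-a), in which tied pairs count as neither concordant nor discordant; the function name is hypothetical, and library implementations commonly apply a tie correction that this sketch omits.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs, ignoring ties."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# First five ID_de_cliente / ID_de_persona_de_contacto values from the sample.
clientes = [139, 892, 542, 924, 43]
contactos = [1277, 3092, 2283, 3124, 1085]
print(kendall_tau(clientes, contactos))  # 1.0

# One swapped pair out of six: (5 - 1) / 6
print(round(kendall_tau([1, 2, 3, 4], [1, 3, 2, 4]), 4))  # 0.6667
```

Comparing every pair takes quadratic time, which is fine for a sketch but slow on the full 88258 rows.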

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

Count: a simple visualization of nullity by column.
Matrix: the nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
Heatmap: the correlation heatmap measures nullity correlation, that is, how strongly the presence or absence of one variable affects the presence of another.
Dendrogram: the dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

Index | ID_de_pedido | ID_de_cliente | ID_de_vendedor | Seleccionado_por_ID_de_persona | ID_de_persona_de_contacto | ID_de_pedido_pendiente | Fecha_de_pedido | Fecha_de_entrega_esperada | Numero_de_pedido_de_compra_del_cliente | Pedido_pendiente_de_suministro_insuficiente | Comentarios | Instrucciones_de_entrega | Comentarios_internos | Seleccion_completada_cuando
0 | 44495 | 139 | 6 | 18.0 | 1277 | NaN | Mar 05,2015 | 2015-03-06 | 13118 | True | None | None | None | 2015-03-06 11:00:00
1 | 45625 | 892 | 15 | 17.0 | 3092 | NaN | Mar 24,2015 | 2015-03-25 | 11303 | True | None | None | None | 2015-03-24 11:00:00
2 | 23645 | 542 | 8 | 9.0 | 2283 | NaN | Mar 18,2014 | 2014-03-19 | 18340 | True | None | None | None | 2014-03-18 11:00:00
3 | 47505 | 924 | 15 | 4.0 | 3124 | NaN | Apr 21,2015 | 2015-04-22 | 18645 | True | None | None | None | 2015-04-21 11:00:00
4 | 37755 | 43 | 7 | 10.0 | 1085 | NaN | Nov 11,2014 | 2014-11-12 | 16259 | True | None | None | None | 2014-11-11 11:00:00
5 | 36114 | 54 | 3 | 10.0 | 1107 | NaN | Oct 16,2014 | 2014-10-17 | 17861 | True | None | None | None | 2014-10-16 11:00:00
6 | 48423 | 956 | 3 | 9.0 | 3156 | NaN | May 04,2015 | 2015-05-05 | 19616 | True | None | None | None | 2015-09-24 11:00:00
7 | 38465 | 1028 | 16 | 12.0 | 3228 | NaN | Nov 24,2014 | 2014-11-25 | 18936 | True | None | None | None | 2014-11-24 11:00:00
8 | 48651 | 39 | 2 | NaN | 1077 | NaN | May 07,2015 | 2015-05-08 | 13944 | True | None | None | None | NaT
9 | 34080 | 25 | 8 | 3.0 | 1049 | NaN | Sep 10,2014 | 2014-09-11 | 11079 | True | None | None | None | 2014-09-10 11:00:00

Last rows

Index | ID_de_pedido | ID_de_cliente | ID_de_vendedor | Seleccionado_por_ID_de_persona | ID_de_persona_de_contacto | ID_de_pedido_pendiente | Fecha_de_pedido | Fecha_de_entrega_esperada | Numero_de_pedido_de_compra_del_cliente | Pedido_pendiente_de_suministro_insuficiente | Comentarios | Instrucciones_de_entrega | Comentarios_internos | Seleccion_completada_cuando
88248 | 63342 | 894 | 3 | NaN | 3094 | 63377.0 | 2015-12-22 | 2015-12-23 | 10582 | True | None | None | None | 2015-12-22 12:00:00
88249 | 37851 | 94 | 16 | 3.0 | 1187 | NaN | 2014-11-13 | 2014-11-14 | 12114 | True | None | None | None | 2014-11-13 11:00:00
88250 | 70517 | 830 | 16 | 8.0 | 3030 | NaN | 2016-04-15 | 2016-04-18 | 15939 | True | None | None | None | 2016-04-15 11:00:00
88251 | 31919 | 575 | 2 | 11.0 | 2349 | NaN | 2014-08-01 | 2014-08-04 | 11877 | True | None | None | None | 2014-08-01 11:00:00
88252 | 58045 | 122 | 6 | NaN | 1243 | 58116.0 | 2015-09-29 | 2015-09-30 | 18016 | True | None | None | None | 2015-09-29 12:00:00
88253 | 71539 | 897 | 14 | 18.0 | 3097 | NaN | 2016-04-30 | 2016-05-02 | 18811 | True | None | None | None | 2016-04-30 11:00:00
88254 | 47453 | 981 | 3 | 4.0 | 3181 | NaN | 2015-04-21 | 2015-04-22 | 19539 | True | None | None | None | 2015-04-21 11:00:00
88255 | 58181 | 869 | 3 | 20.0 | 3069 | NaN | 2015-09-30 | 2015-10-01 | 19442 | True | None | None | None | 2015-09-30 11:00:00
88256 | 57068 | 892 | 6 | 19.0 | 3092 | NaN | 2015-09-14 | 2015-09-15 | 19090 | True | None | None | None | 2015-09-14 11:00:00
88257 | 34340 | 70 | 2 | NaN | 1139 | NaN | 2014-09-15 | 2014-09-16 | 19896 | True | None | None | None | NaT